Evaluating Translational Correspondence using Annotation Projection

Authors

  • Rebecca Hwa
  • Philip Resnik
  • Amy Weinberg
  • Okan Kolak
Abstract

Recently, statistical machine translation models have begun to take advantage of higher level linguistic structures such as syntactic dependencies. Underlying these models is an assumption about the directness of translational correspondence between sentences in the two languages; however, the extent to which this assumption is valid and useful is not well understood. In this paper, we present an empirical study that quantifies the degree to which syntactic dependencies are preserved when parses are projected directly from English to Chinese. Our results show that although the direct correspondence assumption is often too restrictive, a small set of principled, elementary linguistic transformations can boost the quality of the projected Chinese parses by 76% relative to the unimproved baseline.
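
To make the idea of direct projection concrete, the following is a minimal sketch, not the authors' algorithm. It assumes a one-to-one word alignment and unlabeled dependencies given as (head, dependent) index pairs; the names `project_dependencies` and `unlabeled_precision_recall` are hypothetical and chosen only for illustration. Under these assumptions, an English dependency is copied to the target side whenever both of its words have an aligned counterpart, and the projected edges are then scored against a reference parse.

```python
from typing import Dict, Set, Tuple

# A dependency is an unlabeled (head index, dependent index) pair.
Dependency = Tuple[int, int]


def project_dependencies(
    source_deps: Set[Dependency],
    alignment: Dict[int, int],
) -> Set[Dependency]:
    """Copy source dependencies to the target side through a word alignment.

    Assumption (for illustration only): the alignment is one-to-one and maps
    a source word index to a target word index. Unaligned words are skipped.
    """
    projected: Set[Dependency] = set()
    for head, dependent in source_deps:
        # Project an edge only when both of its endpoints are aligned.
        if head in alignment and dependent in alignment:
            target_head = alignment[head]
            target_dependent = alignment[dependent]
            # Drop degenerate edges where both words map to the same target.
            if target_head != target_dependent:
                projected.add((target_head, target_dependent))
    return projected


def unlabeled_precision_recall(
    projected: Set[Dependency],
    gold: Set[Dependency],
) -> Tuple[float, float]:
    """Score projected edges against a reference set of target dependencies."""
    correct = len(projected & gold)
    precision = correct / len(projected) if projected else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


if __name__ == "__main__":
    # Toy indices only; these are not data from the paper.
    source_deps = {(1, 0), (1, 2)}   # source word 1 heads words 0 and 2
    alignment = {0: 0, 1: 2}         # source word 2 has no aligned target word
    gold_target_deps = {(2, 0), (2, 1)}

    projected = project_dependencies(source_deps, alignment)
    print(projected)
    print(unlabeled_precision_recall(projected, gold_target_deps))
```

In this toy case the unaligned source word causes one target dependency to be missed, which is one simple way a strictly direct projection can fall short of the reference parse.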

Similar resources

The Projector: An Interactive Annotation Projection Visualization Tool

Previous works proposed annotation projection in parallel corpora to inexpensively generate treebanks or propbanks for new languages. In this approach, linguistic annotation is automatically transferred from a resource-rich source language (SL) to translations in a target language (TL). However, annotation projection may be adversely affected by translational divergences between specific langua...

Translational Equivalence and Cross-lingual Parallelism: The Case of FrameNet Frames

Annotation projection is a strategy for the cross-lingual transfer of annotations which can be used to bootstrap linguistic resources for low-density languages, such as role-semantic databases similar to FrameNet. In this paper, we investigate the main assumption underlying annotation projection, cross-lingual parallelism, which states that annotation is parallel across languages. Concentrating...

Cross-lingual Abstract Meaning Representation Parsing

Abstract Meaning Representation (AMR) annotation efforts have mostly focused on English. In order to train parsers on other languages, we propose a method based on annotation projection, which involves exploiting annotations in a source language and a parallel corpus of the source language and a target language. Using English as the source language, we show promising results for Italian, Spanis...

Evaluating Three Image Segmentation Algorithms from Two Perspectives: Segmentation Error Measures and Image Annotation

Image segmentation has an essential role in the image annotation process which assigns meaningful words to an image taking into account its content. For this reason it is important to identify which segmentation algorithm is producing better results. This evaluation can be made using segmentation error measures for consistency quantification and by analyzing the results of the annotation proces...

Analysis of Translational Correspondence in view of Sub-sentential Alignment

This paper reports on the first results of an empirical study of translational correspondence in different text types for the English-Dutch language pair. A Gold Standard was created, which can be used as a standard data set for evaluating subsentential alignment. The manually indicated translational correspondences were analyzed in view of different heuristics used in existing sub-sentential a...

Publication date: 2002